Feature/neural nets factory and cookbook #4386
Conversation
Looks good, how does it work?

#![train_and_apply]
network.train(features_train)
Labels labels_predict = network.apply_multiclass(features_train)
apply
#![add_layers]

#![create_instance]
NeuralNetwork network(all_layers)
NeuralNetwork network = neural_network("NeuralNetwork", labels=labels_train)
A neural network is a machine, so this could be instantiated using the machine factory.
This means that the methods below (set_layers, quick_connect, and initialize_neural_network) would all have to be changed.
set_layers can be done via put (already now!), it just needs a dispatcher for CDynamicObjectArray in put.
quick_connect and initialize_neural_network could just be called automatically inside the train_machine method, which should be a minor refactor.
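A minimal C++ sketch of that refactor, assuming the existing quick_connect() and initialize_neural_network() members and the m_is_initialized guard flag that appears later in this PR:

// Sketch only: connect and initialize lazily inside train_machine so the
// helpers never need to be called through the public API.
// (Assumes initialize_neural_network() sets m_is_initialized.)
bool CNeuralNetwork::train_machine(CFeatures* data)
{
	if (!m_is_initialized)
	{
		quick_connect();             // wire up the layers sequentially
		initialize_neural_network(); // allocate and initialize parameters
	}
	// ... existing training code (gradient descent / L-BFGS) ...
	return true;
}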
network.train(features_train)
BinaryLabels labels_predict = network.apply_binary(features_test)
Labels labels_predict = network.apply_binary(features_test)
apply
src/shogun/util/factory.h (outdated)

@@ -34,6 +36,8 @@ namespace shogun
CECOCEncoder* ecoc_encoder(const std::string& name);
CECOCDecoder* ecoc_decoder(const std::string& name);
CTransformer* transformer(const std::string& name);
CNeuralLayer* neural_layer(const std::string& name);
CNeuralNetwork* neural_network(const std::string& name);
Would rather not have a network factory, but instead use the machine factory.
Looks good! I think this is a good basis. Just need the refactoring to clean things up a bit.
Please also update to use only 2 classes (and then try less data, and also please post the runtime of the meta example).
{
	quick_connect();
}
if (!m_is_initialized)
This is cleaner since it will avoid multiple calls to initialize_neural_network.
yep!
Although ... why would there be multiple calls? Due to multiple "train" calls?
Also, we would need to make sure that these things are set to false if the network architecture is changed.
So actually, always calling these might be a better idea. The cost is negligible.
Thoughts?
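One way to handle the architecture-change concern, as a sketch (the setter body and ref-counting details are assumptions for illustration):

// Sketch: invalidate the initialization whenever the architecture changes,
// so the next train() call re-connects and re-initializes the network.
void CNeuralNetwork::set_layers(CDynamicObjectArray* layers)
{
	SG_REF(layers);
	SG_UNREF(m_layers);
	m_layers = layers;
	m_is_initialized = false; // force re-initialization on next train
}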
Force-pushed from b8f7455 to f680424
#![add_layers]

#![create_instance]
Machine network = machine("NeuralNetwork", labels=labels_train, max_num_epochs=4, epsilon=0.0, optimization_method=enum ENNOptimizationMethod.NNOM_GRADIENT_DESCENT, gd_learning_rate=0.01, gd_mini_batch_size=3, max_norm=1.0, dropout_input=0.5, layers=layers_conv)
nice!
layers.softmax(2)
DynamicObjectArray all_layers = layers.done()
DynamicObjectArray all_layers()
NeuralLayer input = neural_layer("NeuralInputLayer", num_neurons=num_feats)
I wonder, could we go with the simpler name layer?
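For what it's worth, the rename would presumably be a one-line change to the declaration shown above in src/shogun/util/factory.h (sketch):

CNeuralLayer* layer(const std::string& name);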
src/shogun/base/base_types.h (outdated)

@@ -31,7 +31,8 @@ namespace shogun
std::is_same<CLabels, T>::value ||
std::is_same<CECOCEncoder, T>::value ||
std::is_same<CECOCDecoder, T>::value ||
std::is_same<CMulticlassStrategy, T>::value>
std::is_same<CMulticlassStrategy, T>::value ||
std::is_same<CDynamicObjectArray, T>::value>
exactly like this!
layers.linear(1)
DynamicObjectArray all_layers = layers.done()
int num_feats = features_train.get_int("num_features")
DynamicObjectArray all_layers()
BTW, instead of using the array type here explicitly, we should be able to use SGObject::add, which allows appending to an internal array.
The call should be machine.add("layers", input); etc. This means you would first create the network instance, and only then add the layers.
If this works, we don't need to add the array type to shogun's base classes, which I would prefer.
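For illustration, the proposed construction order might look like this from C++ (a sketch; it assumes the machine factory used elsewhere in this PR, the hypothetical layer rename discussed above, and shogun's tag-based put):

#include <shogun/util/factory.h>

using namespace shogun;

// Sketch: create the network first, then append layers via SGObject::add,
// so no CDynamicObjectArray has to be built on the user side.
CMachine* network = machine("NeuralNetwork");
CNeuralLayer* input = layer("NeuralInputLayer"); // hypothetical renamed factory
input->put("num_neurons", num_feats);            // num_feats assumed defined elsewhere
network->add("layers", input);
// ... add the remaining layers the same way, then train as usual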
see the multiple kernel learning example where the same thing is done
nice!
If we can get rid of the array type, that would be better. Sorry, I only just realised this.
The other thing is that we want to use only 2 classes.
Force-pushed from c6ce549 to eb9290a
Force-pushed from eb9290a to db95fb4
Force-pushed from 2817b53 to 11b3922
@@ -229,6 +234,13 @@ CDenseFeatures< float64_t >* CNeuralNetwork::transform(

bool CNeuralNetwork::train_machine(CFeatures* data)
{
	if (m_auto_quick_initialize)
I think this is cleaner since there are no states now :) the code will work like it did before.
SG_ADD(
	&m_layers, "layers", "DynamicObjectArray of NeuralNetwork objects",
	MS_NOT_AVAILABLE);
SG_ADD(&m_auto_quick_initialize, "auto_quick_initialize", "auto_quick_initialize", MS_NOT_AVAILABLE);
This needs a better description.
"Whether the network is automatically connected and initialized on training."
/** True if the network layers are to be quick connected and initialized.
 * The initial value is false.
 */
bool m_auto_quick_initialize;
Please refer to the doxygen of the methods that are called here.
#![set_parameters]
#![add_layers]
NeuralLayer input = layer("NeuralInputLayer", num_neurons=num_feats)
network.add("layers", input)
yes this is much nicer. Should have thought about it straight away :)
network.set_gd_learning_rate(0.1)
network.set_gd_momentum(0.9)
#![set_parameters]
#![add_layers]
If you change these references you will need to change the cookbook ones as well! The cookbook refers to these snippets!
Right, I will edit the cookbook too :)
layers.input(dimensions)
layers.rectified_linear(20)
layers.linear(1)
DynamicObjectArray all_layers = layers.done()
Is this taken care of?
Ah yes, it just returns the DynamicObjectArray, so if we set that via add instead, all should be good.
m_auto_quick_initialize = false;
m_sigma = 0.01f;
m_layers = new CDynamicObjectArray();
you need to unref this in the destructor
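A sketch of the corresponding cleanup (assuming m_layers is only referenced by the network itself):

CNeuralNetwork::~CNeuralNetwork()
{
	// balance the reference taken when init() allocates the array
	SG_UNREF(m_layers);
	// ... existing destructor body ...
}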
@@ -772,7 +784,11 @@ void CNeuralNetwork::init()
m_lbfgs_temp_inputs = NULL;
m_lbfgs_temp_targets = NULL;
m_is_training = false;

m_auto_quick_initialize = false;
m_sigma = 0.01f;
the f is not necessary
Apart from some minor bits, this is good!
File f_feats_test = csv_file("../../data/fm_test_mnist_16x16.dat")
File f_labels_train = csv_file("../../data/labels_train_mnist_16x16.dat")
File f_labels_test = csv_file("../../data/labels_test_mnist_16x16.dat")
File f_feats_train = csv_file("../../data/classifier_3class_256d_linear_features_train.dat")
I am confused by the "linear" in the filename. Also, I think it should still say "mnist".
What about mnist_3class_256d_features_train.dat?
Yes, that is a better name :)
Happy to merge after the filename change.
Force-pushed from b1cbac8 to 9aae0bb
Force-pushed from 9aae0bb to 599d431
Tada :)
* convolutional net meta example
* data update
* cnn integration test
* use add api for layers
* rename factory and initialize layers
* convolutional layer cookbook and other updates
No description provided.